EXECUTING NESTED PARALLEL LOOPS ON SHARED-MEMORY MULTIPROCESSORS

Authors

  • Sadun Anik
  • Wen-mei W. Hwu
Abstract

Cache-coherent, bus-based shared-memory multiprocessors are a cost-effective platform for parallel processing. In scientific parallel applications, most of the computation involves processing of large multidimensional data structures, which results in a high degree of data parallelism. This parallelism can be exploited in the form of nested parallel loops. Most existing shared-memory multiprocessors exploit this multi-level parallelism at only one level. In this paper, we explore efficient algorithms and models for executing nested parallel loops and present a simulation-based performance comparison of different techniques using real application traces. We show that it is possible to exploit the parallelism in nested parallel loops with the use of good scheduling and synchronization algorithms.

Similar resources

A hybrid scheme for efficiently executing nested loops on multiprocessors

Wang, C.-M. and S.-D. Wang, A hybrid scheme for efficiently executing nested loops on multiprocessors, Parallel Computing 18 (1992) 625-637. In this paper, we address the problem of scheduling parallel processors for efficiently executing nested loops. The goal is to achieve optimal load-balancing by using as few scheduling and communication operations as possible. For this purpose, we propo...

A Scheme for Detecting the Termination of a Parallel Loop Nest

One central problem in the execution of parallel nested loops with non-affine bounds is the precise scanning (i.e., enumeration) of the points in their iteration space and the detection of their termination. Scanning schemes have been proposed for both shared-memory and distributed-memory implementations. However, these schemes work only for perfectly nested while loops. We propose a scheme which...

Architectural and Software Support for Executing Numerical Applications on High Performance Computers

Numerical applications require large amounts of computing power. Although shared memory multiprocessors provide a cost-effective platform for parallel execution of numerical programs, parallel processing has not delivered the expected performance on these machines. There are two crucial steps in parallel execution of numerical applications: (1) effective parallelization of an application and (2) ...

Simple Code Generation for special UDLs

This paper focuses on transforming sequential perfectly nested loops into their equivalent parallel form. A special category of FOR nested loops is the uniform dependence loops (UDLs), which yield efficient parallelization techniques. An automatic code generation tool for shared and distributed memory machines has been developed in order to automatically parallelize these perfectly nested loop...

Publication date: 1992